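Randomized controlled trials (RCTs) represent the gold standard when developing policy guidelines. However, RCTs are often narrow and lack data on broader populations of interest. Causal effects in these populations are often estimated using observational datasets, which may suffer from unobserved confounding and selection bias. Given a set of observational estimates (e.g., from multiple studies), we propose a meta-algorithm that attempts to reject observational estimates that are biased. We do so using validation effects: causal effects that can be inferred from both RCT and observational data. After rejecting estimators that do not pass this test, we generate conservative confidence intervals on the extrapolated causal effects for subgroups not observed in the RCT. Under the assumption that at least one observational estimator is asymptotically normal and consistent for both the validation and extrapolated effects, we provide guarantees on the coverage probability of the intervals output by our algorithm. To facilitate such testing in settings where causal effects are transported across datasets, we give conditions under which a doubly robust estimator of group average treatment effects is asymptotically normal, even when flexible machine learning methods are used to estimate nuisance parameters. We illustrate the properties of our approach on semi-synthetic and real-world datasets, and show that it compares favorably to standard meta-analysis techniques.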
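The reject-then-extrapolate idea in this abstract can be illustrated with a deliberately simplified sketch. It assumes a single validation effect, independent and approximately normal estimates, a plain two-sided z-test, and no multiple-testing adjustment. The function name `falsify_and_extrapolate` and the input layout are hypothetical, and the test and interval construction in the paper itself may differ.

```python
from statistics import NormalDist


def falsify_and_extrapolate(rct_val, obs_studies, alpha=0.05):
    """Simplified illustration (not the paper's exact procedure).

    rct_val:     (estimate, std_err) of the validation effect from the RCT.
    obs_studies: list of dicts with keys "val" and "ext", each holding an
                 (estimate, std_err) pair from one observational study.
    Returns (accepted_indices, interval), where interval spans the union of
    the accepted studies' intervals for the extrapolated effect, or None
    if every study is rejected.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    accepted, lows, highs = [], [], []
    for i, study in enumerate(obs_studies):
        v_est, v_se = study["val"]
        # z-statistic for "observational and RCT validation effects agree",
        # assuming the two estimates are independent.
        stat = abs(v_est - rct_val[0]) / (v_se ** 2 + rct_val[1] ** 2) ** 0.5
        if stat <= z:  # not falsified: keep this study
            e_est, e_se = study["ext"]
            accepted.append(i)
            lows.append(e_est - z * e_se)
            highs.append(e_est + z * e_se)
    if not accepted:
        return accepted, None
    # Conservative choice: cover every accepted study's interval.
    return accepted, (min(lows), max(highs))
```

Taking the hull of all accepted intervals is what makes the output conservative in this sketch: it stays valid as long as at least one accepted estimator is unbiased, at the cost of width.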
translated by 谷歌翻译
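One of the goals of learning algorithms is to complement and reduce the burden on human decision makers. The expert deferral setting, in which an algorithm can either predict on its own or defer the decision to a downstream expert, helps accomplish this goal. A fundamental aspect of this setting is the need to learn complementary predictors that improve on the human's weaknesses, rather than predictors optimized for average error. In this work, we provide the first theoretical analysis of the benefit of learning complementary predictors in expert deferral. To learn such predictors efficiently, we consider a family of consistent surrogate loss functions for expert deferral and analyze their theoretical properties. Finally, we design active learning schemes that require a minimal amount of human expert prediction data in order to learn accurate deferral systems.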
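To make the deferral setting concrete, the following minimal sketch scores a combined system in which each example is handled either by the model or by the expert. The function name `system_error` and the toy data are hypothetical, and the sketch only evaluates a given deferral rule; it does not implement the surrogate losses or active learning schemes the abstract refers to.

```python
def system_error(labels, model_preds, expert_preds, defer):
    """Misclassification rate of a model-plus-expert deferral system.

    defer[i] is True when example i is handed to the expert,
    otherwise the model's own prediction is used.
    """
    wrong = 0
    for y, m, h, r in zip(labels, model_preds, expert_preds, defer):
        final = h if r else m  # expert answers only when deferred to
        wrong += final != y
    return wrong / len(labels)


# Toy data: model and expert are each wrong half the time,
# but on different examples.
labels = [0, 1, 1, 0]
model_preds = [0, 1, 0, 1]
expert_preds = [1, 0, 1, 0]
never_defer = [False, False, False, False]
complementary = [False, False, True, True]
```

In this toy case, never deferring and always deferring both give an error of 0.5, while the complementary rule reaches 0, which is the gap between optimizing average error and optimizing the joint system.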
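We give a method for proactively identifying small, plausible shifts in distribution that lead to large differences in model performance. To ensure that these shifts are plausible, we parameterize them in terms of interpretable changes in the causal mechanisms of observed variables. This defines a parametric robustness set of plausible distributions and a corresponding worst-case loss. While the loss under an individual parametric shift can be estimated via reweighting techniques such as importance sampling, the resulting worst-case optimization problem is non-convex, and the estimate may suffer from large variance. For small shifts, however, we can construct a local second-order approximation to the loss under shift and cast the problem of finding a worst-case shift as a particular non-convex quadratic optimization problem, for which efficient algorithms are available. We demonstrate that this second-order approximation can be estimated directly for shifts in conditional exponential family models, and we bound the approximation error. We apply our approach to a computer vision task (classifying gender from images), revealing sensitivity to shifts in non-causal attributes.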
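The reweighting step mentioned in this abstract can be sketched for the simplest possible case: a single binary variable whose log-odds are shifted by `delta`. This is an assumed toy setup, not the paper's estimator; the names `shifted_loss`, `eta`, and `delta` are hypothetical, and the second-order approximation and worst-case search are not shown.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shifted_loss(samples, eta, delta):
    """Importance-sampling estimate of the mean loss under a shifted law.

    samples: list of (w, loss) pairs with w in {0, 1}, drawn from the
             training distribution where P(W = 1) = sigmoid(eta).
    delta:   shift of the natural parameter, so that the target
             distribution has P(W = 1) = sigmoid(eta + delta).
    """
    p = sigmoid(eta)          # training probability of W = 1
    q = sigmoid(eta + delta)  # shifted probability of W = 1
    total = 0.0
    for w, loss in samples:
        # Density ratio between shifted and training distributions.
        weight = q / p if w == 1 else (1 - q) / (1 - p)
        total += weight * loss
    return total / len(samples)
```

With `delta = 0` the weights are all 1 and the estimate reduces to the ordinary sample mean; large `delta` produces extreme weights, which is the variance problem the abstract points to.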
A long-running goal of the clinical NLP community is the extraction of important variables trapped in clinical notes. However, roadblocks have included dataset shift from the general domain and a lack of public clinical corpora and annotations. In this work, we show that large language models, such as InstructGPT, perform well at zero- and few-shot information extraction from clinical text despite not being trained specifically for the clinical domain. Whereas text classification and generation performance have already been studied extensively in such models, here we additionally demonstrate how to leverage them to tackle a diverse set of NLP tasks which require more structured outputs, including span identification, token-level sequence classification, and relation extraction. Further, due to the dearth of available data to evaluate these systems, we introduce new datasets for benchmarking few-shot clinical information extraction based on a manual re-annotation of the CASI dataset for new tasks. On the clinical extraction tasks we studied, the GPT-3 systems significantly outperform existing zero- and few-shot baselines.
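Expert decision makers are starting to rely on data-driven automated agents to assist them with various tasks. For this collaboration to work properly, the human decision maker must have a mental model of when and when not to rely on the agent. In this work, we aim to ensure that human decision makers learn a valid mental model of the agent's strengths and weaknesses. To accomplish this goal, we propose an exemplar-based teaching strategy in which humans solve the task with the help of the agent and try to formulate a set of guidelines for when and when not to defer. We present a novel parameterization of the human's mental model of the AI that applies a nearest-neighbor rule in local regions surrounding the teaching examples. Using this model, we derive a near-optimal strategy for selecting a representative teaching set. We validate the benefits of our teaching strategy on a multi-hop question answering task using crowd workers and find that when workers draw the right lessons from the teaching stage, their task performance improves. We further validate our method on a set of synthetic experiments.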
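Label-scarce, high-dimensional domains such as healthcare present a challenge for modern machine learning techniques. To overcome the difficulties posed by a lack of labeled data, we explore an "order-contrastive" method for self-supervised pre-training on longitudinal data. We sample pairs of time segments, switch the order for half of them, and train a model to predict whether a given pair is in the correct order. Intuitively, the ordering task allows the model to attend to the least time-reversible features (for example, features that indicate progression of a chronic disease). The same features are often useful for downstream tasks of interest. To quantify this, we study a simple theoretical setting in which we prove a finite-sample guarantee for the downstream error of a representation learned with order-contrastive pre-training. Empirically, in synthetic and longitudinal healthcare settings, we demonstrate the effectiveness of order-contrastive pre-training in the small-data regime over supervised learning and other self-supervised pre-training baselines. Our results indicate that pre-training methods designed for particular classes of distributions and downstream tasks can improve the performance of self-supervised learning.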
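The pair-construction step described in this abstract (sample pairs of time segments, switch half, predict the order) might look roughly like the sketch below. How segments are chosen is an assumption here: any two distinct positions of one trajectory are sampled uniformly. The function name `make_order_contrastive_pairs` is hypothetical.

```python
import random


def make_order_contrastive_pairs(trajectory, n_pairs, seed=0):
    """Build (pair, label) examples for an order-prediction pretext task.

    trajectory: list of time segments in chronological order.
    Label 1 means the pair is in its true order, 0 means it was switched.
    Every second pair is switched, so half are switched when n_pairs is even.
    """
    rng = random.Random(seed)
    pairs = []
    for k in range(n_pairs):
        # Two distinct positions, sorted so that i comes before j in time.
        i, j = sorted(rng.sample(range(len(trajectory)), 2))
        earlier, later = trajectory[i], trajectory[j]
        if k % 2 == 1:
            pairs.append(((later, earlier), 0))  # switched order
        else:
            pairs.append(((earlier, later), 1))  # correct order
    return pairs
```

A binary classifier trained on these examples can only succeed by picking up features that change in a consistent direction over time, which is the intuition the abstract gives for why the learned representation helps downstream.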
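Unsupervised learning is often used to uncover clusters in data. However, different kinds of noise may impede the discovery of useful patterns from real-world time-series data. In this work, we focus on mitigating the interference of interval censoring in the task of clustering for disease phenotyping. We develop a deep generative, continuous-time model of time-series data that clusters time series while correcting for censorship time. We provide conditions under which clusters and the amount of delayed entry may be identified from data under a noiseless model.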
There is intense interest in applying machine learning to problems of causal inference in fields such as healthcare, economics and education. In particular, individual-level causal inference has important applications such as precision medicine. We give a new theoretical analysis and family of algorithms for predicting individual treatment effect (ITE) from observational data, under the assumption known as strong ignorability. The algorithms learn a "balanced" representation such that the induced treated and control distributions look similar. We give a novel, simple and intuitive generalization-error bound showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation. We use Integral Probability Metrics to measure distances between distributions, deriving explicit bounds for the Wasserstein and Maximum Mean Discrepancy (MMD) distances. Experiments on real and simulated data show the new algorithms match or outperform the state-of-the-art.
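As an illustration of the distance term in the bound described above, here is a minimal sketch of a squared Maximum Mean Discrepancy between treated and control representations. The RBF kernel, the bandwidth `sigma`, the biased (V-statistic) estimator, and the function names are assumptions chosen for brevity; the abstract does not specify these details.

```python
import math


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel between two equal-length vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def mmd_squared(treated, control, sigma=1.0):
    """Biased estimate of squared MMD between two sets of representations."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(treated, treated)
            + mean_kernel(control, control)
            - 2 * mean_kernel(treated, control))
```

A value near zero indicates that the treated and control groups look alike in representation space, which is what a "balanced" representation aims for; used as a training penalty, it would be added to the usual prediction loss.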
Multivariate time series data in practical applications, such as health care, geoscience, and biology, are characterized by a variety of missing values. In time series prediction and other related tasks, it has been noted that missing values and their missing patterns are often correlated with the target labels, a.k.a. informative missingness. There is very limited work on exploiting the missing patterns for effective imputation and improving prediction performance. In this paper, we develop novel deep learning models, namely GRU-D, as one of the early attempts. GRU-D is based on the Gated Recurrent Unit (GRU), a state-of-the-art recurrent neural network. It takes two representations of missing patterns, i.e., masking and time interval, and effectively incorporates them into a deep model architecture so that it not only captures the long-term temporal dependencies in time series, but also utilizes the missing patterns to achieve better prediction results. Experiments on time series classification tasks with real-world clinical datasets (MIMIC-III, PhysioNet) and synthetic datasets demonstrate that our models achieve state-of-the-art performance and provide useful insights for better understanding and utilization of missing values in time series analysis.
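The two missingness representations named in this abstract, masking and time interval, can be computed as in the sketch below. The input layout (a list of time steps, with `None` marking a missing value) and the function name `masks_and_intervals` are assumptions for illustration; the recurrent model that consumes these inputs is not shown.

```python
def masks_and_intervals(values, timestamps):
    """Compute per-variable masks and time-since-last-observation.

    values:     list of T rows, each a list of D entries (None = missing).
    timestamps: list of T observation times, in increasing order.
    Returns (masks, deltas), both T x D nested lists.
    """
    n_steps, n_vars = len(values), len(values[0])
    # Mask is 1 where the variable was observed, 0 where it is missing.
    masks = [[0 if v is None else 1 for v in row] for row in values]
    deltas = [[0.0] * n_vars for _ in range(n_steps)]  # zero at first step
    for t in range(1, n_steps):
        gap = timestamps[t] - timestamps[t - 1]
        for d in range(n_vars):
            # If the previous step was observed, the interval restarts at
            # the gap; otherwise it keeps accumulating.
            carried = 0.0 if masks[t - 1][d] else deltas[t - 1][d]
            deltas[t][d] = gap + carried
    return masks, deltas
```

For example, with timestamps `[0, 1, 3, 4]` and a variable observed only at the first and last steps, the intervals are `0, 1, 3, 4`: the counter keeps growing until the step after an observation.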
Observational studies are rising in importance due to the widespread accumulation of data in fields such as healthcare, education, employment and ecology. We consider the task of answering counterfactual questions such as, "Would this patient have lower blood sugar had she received a different medication?". We propose a new algorithmic framework for counterfactual inference which brings together ideas from domain adaptation and representation learning. In addition to a theoretical justification, we perform an empirical comparison with previous approaches to causal inference from observational data. Our deep learning algorithm significantly outperforms the previous state-of-the-art.